Feature Selection for Optimized High-dimensional Biomedical Data using the Improved Shuffled Frog Leaping Algorithm

Authors

  • Bin Hu
  • Yongqiang Dai
  • Yun Su
  • Philip Moore
  • Xiaowei Zhang
  • Chengsheng Mao
  • Jing Chen
  • Lixin Xu
Abstract

High-dimensional biomedical datasets contain thousands of features that can be used in the molecular diagnosis of disease. However, such datasets also contain many irrelevant or weakly correlated features, which degrade the predictive accuracy of diagnosis. Without a feature selection algorithm, it is difficult for existing classification techniques to accurately identify patterns in the features. The purpose of feature selection is not only to identify a feature subset from the original set of features (without reducing the predictive accuracy of the classification algorithm) but also to reduce the computational overhead of data mining. In this paper, we present our improved shuffled frog leaping algorithm, which introduces a chaos memory weight factor, an absolute balance group strategy, and an adaptive transfer factor. Our proposed approach explores the space of possible subsets to obtain the set of features that maximizes predictive accuracy and minimizes irrelevant features in high-dimensional biomedical data. To evaluate the effectiveness of our proposed method, we employ the K-nearest neighbor method in a comparative analysis against genetic algorithms, particle swarm optimization, and the shuffled frog leaping algorithm. Experimental results show that our improved algorithm achieves improvements in the identification of relevant subsets and in classification accuracy.
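As a rough illustration of the wrapper idea described in the abstract, the sketch below scores a candidate feature subset (a binary mask) by the leave-one-out accuracy of a nearest-neighbor classifier. It is not the authors' algorithm: the search is a plain exhaustive enumeration standing in for the improved shuffled frog leaping search, and the `fitness` weighting (`alpha`), the function names, and the toy data are all assumptions made for the example.

```python
from itertools import product


def knn_loo_accuracy(X, y, mask, k=1):
    """Leave-one-out k-NN accuracy using only the features selected by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the selected columns
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X)
            if o != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def fitness(X, y, mask, alpha=0.9):
    """Hypothetical score: reward accuracy, lightly penalise subset size."""
    acc = knn_loo_accuracy(X, y, mask)
    size_term = 1.0 - sum(mask) / len(mask)
    return alpha * acc + (1.0 - alpha) * size_term


# Toy data (made up): feature 0 separates the classes, features 1 and 2 are noise.
X = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 9.0],
    [0.2, 8.0, 4.0],
    [1.0, 5.1, 1.1],
    [1.1, 1.2, 9.2],
    [1.2, 8.1, 4.1],
]
y = [0, 0, 0, 1, 1, 1]

# Exhaustive search over all 2^3 masks; a metaheuristic would replace this step.
best_mask = max(product([0, 1], repeat=3), key=lambda m: fitness(X, y, m))
print(best_mask)  # (1, 0, 0)
```

On real biomedical data with thousands of features, exhaustive enumeration is infeasible, which is why a population-based search over masks is used in its place.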

Similar resources

Improved Frog Leaping Algorithm Using Cellular Learning Automata

In this paper, a new algorithm that combines cellular learning automata with the shuffled frog leaping algorithm (SFLA) is proposed for optimization in continuous, static environments. In the proposed algorithm, each memeplex of frogs is placed in a cell of the cellular learning automata. The learning automaton in each cell acts as the brain of the memeplex and determines the strategy of moti...

Combined Heat and Power Economic Dispatch using Improved Shuffled Frog Leaping Algorithm

Recently, Combined Heat and Power (CHP) systems have been increasingly utilized in power systems. With the growing penetration of CHP-based co-generation of electricity and heat, determining the economic dispatch of power and heat becomes a more complex and challenging issue. The optimal operation of CHP-based systems is inherently a nonlinear and non-convex optimization problem with a lo...

Use of the Improved Frog-Leaping Algorithm in Data Clustering

Clustering is a well-known data mining technique in which data with similar properties are grouped into the same category. The K-means algorithm is one of the simplest clustering algorithms, but it is sensitive to the initial values of the clusters and tends to converge to a local optimum. In recent years, several algorithms based on evolutionary algorithms have been proposed for cluster...

SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

In this paper, we propose a new gene selection algorithm based on the Shuffled Frog Leaping Algorithm, called SFLA-FS. The proposed algorithm is used to improve cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not useful for some tasks, for example cancer classification....

Genetic and Improved Shuffled Frog Leaping Algorithms for a 2-Stage Model of a Hub Covering Location Network

Keywords: Hub covering location problem, Network design, Single machine scheduling, Genetic algorithm, Shuffled frog leaping algorithm

Hub location problems (HLP) are synthetic optimization problems that appear in telecommunication and transportation networks where nodes send and receive commodities (e.g., data transmissions, passenger transportation, express packages, postal deliveries, etc....

Journal:
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics

Publication date: 2016